331 research outputs found

    Cover tree based dynamization of clustering algorithms

    In this work, cover trees are the main focus; they serve as a data structure for efficiently storing metric data. We use them to dynamically handle the k-center problem, both with and without outliers. The cover tree data structure is designed to retrieve a coreset, a very succinct summary of the data, which is then fed to an offline clustering algorithm to quickly obtain a solution for the whole dataset. With respect to the original definition, the implemented cover tree is augmented with new fields that maintain additional information crucial for extracting reasonable coresets.
The solutions obtainable for the mentioned problems are (α + Δ)-approximations, where α represents the best-known approximation achievable in polynomial time in the standard offline setting, and Δ > 0 is a user-provided accuracy parameter. The main objective in using a dynamic data structure is to obtain a reasonable solution compared with the one obtained by applying the clustering algorithms from scratch to all the data points. To ascertain the quality of our solution, we conduct a series of experiments to evaluate its performance and to fine-tune the parameters involved.
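
As an illustration of the offline step mentioned in this abstract, the sketch below shows the classic greedy farthest-first traversal, a well-known 2-approximation for k-center without outliers. It is not the authors' implementation and does not build a cover tree or a coreset; the function name `greedy_k_center` and the choice of Euclidean distance are assumptions made for the example. In the setting described above, such a routine would be run on the extracted coreset rather than on the full dataset.

```python
import math


def greedy_k_center(points, k):
    """Farthest-first traversal (hypothetical helper, Euclidean metric assumed).

    Returns the chosen centers and the resulting radius, i.e. the largest
    distance from any point to its nearest center.
    """
    if not points or k < 1:
        raise ValueError("need at least one point and k >= 1")
    centers = [points[0]]  # arbitrary first center
    # dist[i] = distance from points[i] to its nearest chosen center
    dist = [math.dist(p, centers[0]) for p in points]
    while len(centers) < min(k, len(points)):
        # pick the point that is currently farthest from all centers
        far = max(range(len(points)), key=dist.__getitem__)
        centers.append(points[far])
        dist = [min(d, math.dist(p, points[far])) for d, p in zip(dist, points)]
    return centers, max(dist)


centers, radius = greedy_k_center([(0, 0), (1, 0), (10, 0), (11, 0)], k=2)
```

Here the two centers land at opposite ends of the point set, so every point is within distance 1 of a center.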

    α-MON: Anonymized Passive Traffic Monitoring

    Packet measurements are essential for several applications, such as cyber-security, accounting and troubleshooting. They, however, threaten privacy by exposing sensitive information. Anonymization has been the answer to this challenge, i.e., replacing sensitive information with obfuscated copies. Anonymization of packet traces, however, comes with some drawbacks. First, it reduces the value of data. Second, it requires considering diverse protocols because information may leak from many non-encrypted fields. Third, it must be performed at high speeds directly at the monitor, to prevent private data from leaking, calling for real-time solutions. We present α-MON, a flexible tool for privacy-preserving packet monitoring. It replicates input packet streams to different consumers while anonymizing values according to flexible policies that cover all protocol layers. Besides classic anonymization mechanisms such as IP address obfuscation, α-MON supports α-anonymization, a novel solution to obfuscate values that can be uniquely traced back to limited sets of users. Unlike classic anonymization approaches, α-anonymity works in a streaming fashion, with zero delay, operating at high-speed links on a packet-by-packet basis. We evaluate α-MON performance using packet traces collected from an ISP network. Results show that it enables α-anonymity in real time. α-MON is available to the community as an open-source project.

    α-MON: Traffic Anonymizer for Passive Monitoring

    Packet measurements at scale are essential for several applications, such as cyber-security, accounting and troubleshooting. They, however, threaten users’ privacy by exposing sensitive information. Anonymization has been the answer to this challenge, i.e., replacing sensitive information with obfuscated copies. Anonymization of packet traces, however, comes with some challenges and drawbacks. First, it reduces the value of data. Second, it requires considering diverse protocols because information may leak from many non-encrypted fields. Third, it must be performed at high speeds directly at the monitor, to prevent private data from leaking, calling for real-time solutions. We present α-MON, a flexible tool for privacy-preserving packet monitoring. It replicates input packet streams to different consumers while anonymizing protocol fields according to flexible policies that cover all protocol layers. Besides classic anonymization mechanisms such as IP address obfuscation, α-MON supports z-anonymization, a novel solution to obfuscate rare values that can be uniquely traced back to limited sets of users. Unlike classic anonymization approaches, α-MON works in a streaming fashion, with zero delay, operating at high-speed links on a packet-by-packet basis. We quantify the impact of α-MON on traffic measurements, finding that it introduces minimal error when it comes to finding heavy-hitter services. We evaluate α-MON performance using packet traces collected from an ISP network and show that it achieves a sustainable rate of 40 Gbit/s on a Commercial Off-the-Shelf server. α-MON is available to the community as an open-source project.

    Privacy-preserving network monitoring at high-speed

    Network monitoring represents a key step for several applications, such as cyber-security and traffic engineering. Examples of the data include packet traces captured in the network and log files obtained from services like the DNS and BGP. It is widely known that monitoring may expose privacy-sensitive information. Deep packet inspection, for example, exposes the destination servers contacted by users, and non-encrypted fields of certain protocols, such as Server Name Indication (SNI) in TLS handshakes. New privacy regulations (e.g. GDPR) impose strict rules when handling data that carry privacy-sensitive information. They guarantee the protection of personal data, provide the interested parties certain rights, and assign powers to the regulators to enforce them. As network monitoring data carries information that reveals users' identity, it must be treated in the light of these regulations. Network monitoring infrastructure must guarantee that sensitive information is not leaked or, preferably, must not collect any unnecessary data that may threaten users' privacy. Historically, the solution to these problems has been anonymization, i.e., replacing sensitive fields with obfuscated copies. This approach, however, has two drawbacks. First, anonymization reduces the value of the collected information. For instance, while anonymizing client and server IP addresses in traffic logs helps to protect privacy, it makes it impossible to evaluate particular services that could be identified by their server IP addresses. Second, anonymization of protocol fields in isolation is not sufficient, as users' identity might be revealed by subtler techniques. For example, even if one obfuscates the client IP addresses in DNS traffic logs, the set of hostnames resolved by a client (if exposed in the logs) may still help to uncover identities.
We are building a flexible tool that exposes to monitors only the information strictly required, thus reducing at the source risks to people's privacy. Our solution satisfies three requirements: (i) it automatically searches for protocol fields that can be linked to particular users; (ii) it anonymizes information considering the entire protocol stack, and uses a stateful approach, employing k-anonymization algorithms; (iii) it is lightweight and scalable, thus deployable on high-speed links at multiple Gb/s. Our solution is based on the Intel Data Plane Development Kit, a set of libraries and drivers for fast packet processing. We have built a prototype that is deployed in a campus network. At present, the prototype is able to handle multiple 10 Gb/s links with zero packet losses, performing several anonymization steps on packets. Anonymized packets are forwarded to legacy monitoring systems that receive information already stripped of privacy-sensitive fields. We are testing k-anonymization approaches to perform selective anonymization of sensitive fields, such as TLS SNIs and server IP addresses, with the aim of obfuscating only cases in which the information helps to uncover users behind the traffic. In this poster we will present our architecture and system design, as well as show preliminary results of the prototype deployment.

    z-anonymity: Zero-Delay Anonymization for Data Streams

    With the advent of big data and the birth of the data markets that sell personal information, individuals' privacy is of utmost importance. The classical response is anonymization, i.e., sanitizing the information that can directly or indirectly allow users' re-identification. The most popular solution in the literature is k-anonymity. However, it is hard to achieve k-anonymity on a continuous stream of data, as well as when the number of dimensions becomes high. In this paper, we propose a novel anonymization property called z-anonymity. Unlike k-anonymity, it can be achieved with zero delay on data streams and it is well suited for high-dimensional data. The idea at the base of z-anonymity is to release an attribute (an atomic piece of information) about a user only if at least z - 1 other users have presented the same attribute in a past time window. z-anonymity is weaker than k-anonymity since it does not work on the combinations of attributes, but treats them individually. In this paper, we present a probabilistic framework to map z-anonymity into the k-anonymity property. Our results show that a proper choice of the z-anonymity parameters allows the data curator to likely obtain a k-anonymized dataset, with a precisely measurable probability. We also evaluate a real use case, in which we consider the website visits of a population of users and show that z-anonymity can work in practice for obtaining k-anonymity too.
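
The release rule stated in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the class name `ZAnonymizer`, the method `observe`, and the choice to track one last-seen timestamp per user and attribute are all assumptions made for the example. An observation is released only if, counting the current user, at least z distinct users have shown the same attribute within the last `window` time units.

```python
from collections import defaultdict


class ZAnonymizer:
    """Hypothetical streaming filter applying the z - 1 'other users' rule."""

    def __init__(self, z, window):
        self.z = z            # minimum number of distinct users, current one included
        self.window = window  # length of the past time window, in the caller's time unit
        # attribute -> {user: timestamp of that user's latest observation}
        self.last_seen = defaultdict(dict)

    def observe(self, t, user, attribute):
        """Record (user, attribute) at time t; return True if it may be released."""
        users = self.last_seen[attribute]
        users[user] = t
        # forget users whose latest observation fell out of the window
        for stale in [u for u, ts in users.items() if t - ts > self.window]:
            del users[stale]
        # the decision is taken immediately, with no buffering of the stream
        return len(users) >= self.z


anonymizer = ZAnonymizer(z=3, window=10)
decisions = [
    anonymizer.observe(0, "alice", "example.org"),
    anonymizer.observe(1, "bob", "example.org"),
    anonymizer.observe(2, "carol", "example.org"),
]
```

With these inputs the first two observations are withheld and the third is released, since only then have three distinct users shown the attribute within the window. Each attribute is handled on its own, which matches the abstract's remark that combinations of attributes are not considered.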

    Neue Wege der linguistischen Diskursforschung: computerbasierte Verfahren der Argumentanalyse [New approaches in linguistic discourse research: computer-based methods of argument analysis]

    This article describes and discusses a novel combination of quantitative and qualitative methods for the analysis of big data in linguistic discourse research. The approach presented combines methods of discourse-linguistic argumentation analysis with methods of linguistic text mining. The aim of the method development is a computer-assisted procedure for the semi-automated identification and analysis of arguments in large text corpora. The procedure is tested on a discourse about infrastructure measures. The article presents linguistic devices that co-occur in the corpus and can therefore be regarded as features of argument patterns. Such argument patterns can indicate the occurrence of arguments and the ways they are used in texts.

    Stepwise approach for the control and eventual elimination of Taenia solium as a public health problem

    Background: Taenia solium taeniosis/cysticercosis is a public health and agricultural problem, especially in low-income countries, and has been ranked the top foodborne parasitic hazard globally. In 2012, the World Health Organization published a roadmap that called for a validated strategy for T. solium control and elimination by 2015. This goal has not been met, and validated evidence of effective control or elimination in endemic countries is still incomplete. Measuring and evaluating success of control programmes remains difficult, as locally acceptable targets have not been defined as part of the 2012 roadmap nor from other sources, and the performance of tools to measure effect is limited. Discussion: We believe that an international agreement supported by the tripartite World Health Organization, Food and Agriculture Organization of the United Nations, and World Organisation for Animal Health is needed to facilitate endemic countries in publicising SMART (Specific, Measurable, Achievable/attainable, Relevant, Time-bound) country-level control target goals. These goals should be achievable through locally acceptable adoption of options from within a standardised 'intervention tool-kit', and progress towards these goals should be monitored using standardised and consistent diagnostics. Several intervention tools are available which can contribute to control of T. solium, but the combination of these - the most effective control algorithm - still needs to be identified. In order to mount control efforts and ensure political commitment, stakeholder engagement and funding, we argue that a stepwise approach, as developed for rabies control, is necessary if control efforts are to be successful and sustainable.
Conclusions: The stepwise approach can provide the framework for the development of realistic control goals for endemic areas, the implementation of intervention algorithms, and the standardised monitoring and evaluation of progress towards the control target goals and, eventually, elimination.

    Frustrated Magnetic Cycloidal Structure and Emergent Potts Nematicity in CaMn$_2$P$_2$

    We report neutron-diffraction results on single-crystal CaMn$_2$P$_2$ containing corrugated Mn honeycomb layers and determine its ground-state magnetic structure. The diffraction patterns consist of prominent (1/6, 1/6, $L$) reciprocal lattice unit (r.l.u.; $L$ = integer) magnetic Bragg reflections, whose temperature-dependent intensities are consistent with a first-order antiferromagnetic phase transition at the Néel temperature $T_{\rm N} = 70(1)$ K. Our analysis of the diffraction patterns reveals an in-plane $6\times6$ magnetic unit cell with ordered spins that in the principal-axis directions rotate by 60-degree steps between nearest neighbors on each sublattice that forms the honeycomb structure, consistent with the $P_Ac$ magnetic space group. We find that a few other magnetic subgroup symmetries ($P_A2/c$, $P_C2/m$, $P_S\bar{1}$, $P_C2$, $P_Cm$, $P_S1$) of the paramagnetic $P\bar{3}m11^\prime$ crystal symmetry are consistent with the observed diffraction pattern. We relate our findings to frustrated $J_1$-$J_2$-$J_3$ Heisenberg honeycomb antiferromagnets with single-ion anisotropy and the emergence of Potts nematicity.

    Strawberry cultivars fruit production and postharvest from two types of saplings

    The establishment of strawberry crops in southern Brazil is conditioned on the delivery of bare-root saplings imported from Argentina and/or Chile. An alternative to reduce dependence on the acquisition of these saplings is to replace them with saplings rooted in clods. However, information on the agronomic performance of clod-rooted saplings is scarce. The aim of this work was to investigate whether the association between types of saplings and strawberry cultivars alters fruit production and postharvest quality.
The research was carried out at the Horticulture Sector of the University of Passo Fundo, Passo Fundo, Rio Grande do Sul, Brazil, from March to December 2020, in a greenhouse. The plant material for the research consisted of bare-root saplings and saplings rooted in clods. The treatments were three cultivars (‘Fronteras’, ‘Monterey’ and ‘Portola’) and two types of saplings (bare-root and rooted in a clod), arranged in a randomized block design, with three replications and ten plants per plot. The productive potential and chemical quality of fruits were evaluated. Plants from saplings rooted in clods showed a higher fruit number and fruit production. Plants from bare-root saplings produced larger fruits. Postharvest fruit quality was not altered by the treatments. It is concluded that the productive potential and postharvest quality of fruits of strawberry cultivars are not associated with the types of saplings studied. Regardless of the cultivar used, plants from saplings rooted in clods have greater productive potential than plants from bare-root saplings. The fruits of the three cultivars tested in this study, from bare-root saplings or saplings rooted in clods, present a balanced relationship between sugar and acidity, which gives the fruits the desired flavor in their postharvest period and makes the strawberries suitable for consumption.

    Tests of model predictions for the responses of stellar spectra and absorption-line indices to element abundance variations

    A method that is widely used to analyse stellar populations in galaxies is to apply the theoretically derived responses of stellar spectra and line indices to element abundance variations, which are hereafter referred to as response functions. These are applied in a differential way, to base models, in order to generate spectra or indices with different abundance patterns. In this paper, sets of such response functions for three different stellar evolutionary stages are tested with new empirical [Mg/Fe] abundance data for the medium-resolution Isaac Newton Telescope library of empirical spectra (MILES). Recent theoretical models and observations are used to investigate the effects of [Fe/H], [Mg/H] and overall [Z/H] on spectra, via ratios of spectra for similar stars. The global effects of changes in abundance patterns are investigated empirically through direct comparisons of similar stars from MILES, highlighting the impact of abundance effects in the blue part of the spectrum, particularly for lower temperature stars. It is found that the relative behaviour of iron-sensitive line indices is generally well predicted by response functions, whereas that of Balmer line indices is not. Other indices tend to show large scatter about the predicted mean relations. Implications for element abundance and age studies in stellar populations are discussed and ways forward are suggested to improve the match with the behaviour of spectra and line-strength indices observed in real stars.